Skip to content

Improve cluster id/name handling#188

Merged
keitwb merged 1 commit intorh-ecosystem-edge:mainfrom
keitwb:cluster-id-map
Sep 10, 2025
Merged

Improve cluster id/name handling#188
keitwb merged 1 commit intorh-ecosystem-edge:mainfrom
keitwb:cluster-id-map

Conversation

@keitwb
Copy link
Copy Markdown
Contributor

@keitwb keitwb commented Sep 8, 2025

These changes to the prompt seem to help it both maintain the context of the current cluster id, as well as give it the ability to map a cluser name to a cluster ID to make it easier to work with, especially when picking up in a new conversation for a cluster install.

Summary by CodeRabbit

  • New Features

    • You can now request cluster details by name; the system will auto-map the name to the correct ID.
    • If multiple clusters share the same name, you’ll be asked to choose the correct one.
  • Bug Fixes

    • Reduced unnecessary follow-up questions by using known context.
    • Improved handling and messaging when a cluster name/ID doesn’t exist.
  • Tests

    • Updated and added evaluations to cover name-based cluster requests and name-to-ID mapping.

@openshift-ci openshift-ci bot requested review from jhernand and omertuc September 8, 2025 15:38
@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Sep 8, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: keitwb
Once this PR has been reviewed and has the lgtm label, please assign eranco74 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Sep 8, 2025

Walkthrough

Updates the system prompt to handle cluster names by mapping them to IDs via listing clusters, and adjusts eval tests to reflect name-based lookup flows, including new scenarios for creating and querying clusters by name. No exported/public API changes.

Changes

Cohort / File(s) Summary of changes
System prompt updates
template.yaml
Revised guidance to “explicitly ask unless you already know it.” Added rule to map cluster name to ID via listing clusters and exact-match selection; request clarification on multiple exact matches.
Eval tests for cluster name→ID flow
test/evals/eval_data.yaml
Switched cluster_info eval from direct ID to name-based lookup using list_clusters. Added new evals under cluster_id_from_name: cluster creation and info retrieval by name; updated expected tool calls, keywords, and descriptions. Minor formatting tweaks.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Assistant
  participant Tool_list as Tool: list_clusters
  participant Tool_info as Tool: cluster_info

  User->>Assistant: "Show info for cluster 'abc123'"
  Assistant->>Tool_list: list_clusters()
  Tool_list-->>Assistant: clusters[]
  alt Exactly one exact-name match
    Assistant->>Tool_info: cluster_info(cluster_id=matched_id)
    Tool_info-->>Assistant: cluster details
    Assistant-->>User: Cluster details
  else Multiple exact-name matches
    Assistant-->>User: Ask to clarify which cluster (IDs shown)
  else No exact-name match
    Assistant-->>User: Report not found / ask for ID
  end
  note over Assistant: New flow: name→ID mapping via list_clusters before cluster_info
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

lgtm, ok-to-test

Suggested reviewers

  • omertuc
  • carbonin
  • eranco74

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Improve cluster id/name handling” succinctly and accurately captures the core change of adding name-to-ID mapping and maintaining cluster ID context without extraneous detail, making it clear and focused on the main enhancement in the changeset.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.

✨ Finishing Touches
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
template.yaml (1)

433-433: Parameter expansion bug: ${{SERVICE_PORT}} won’t be substituted by OpenShift Templates.

OpenShift Template params use ${PARAM}, not ${{PARAM}}. These will render literally and break probes/ports.

-    replicas: ${{REPLICAS_COUNT}}
+    replicas: ${REPLICAS_COUNT}
@@
-            containerPort: ${{SERVICE_PORT}}
+            containerPort: ${SERVICE_PORT}
@@
-              port: ${{SERVICE_PORT}}
+              port: ${SERVICE_PORT}
@@
-              port: ${{SERVICE_PORT}}
+              port: ${SERVICE_PORT}
@@
-      port: ${{SERVICE_PORT}}
-      targetPort: ${{SERVICE_PORT}}
+      port: ${SERVICE_PORT}
+      targetPort: ${SERVICE_PORT}

Also applies to: 449-449, 513-514, 521-521, 601-602

🧹 Nitpick comments (2)
template.yaml (2)

336-341: Verify placeholder API key; never commit real keys in ConfigMap.

This “dummy” key is fine if it’s truly a placeholder. If a real key is ever required, move it to a Secret and reference via env/subPath.


131-133: Spelling inconsistency: LIGHTSSPEED vs lightspeed.

Param/usage are consistent, but the misspelling can confuse operators. Consider renaming to LIGHTSPEED_STACK_POSTGRES_SSL_MODE (with a transitional alias) for clarity.

Also applies to: 198-201

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a442df and 415e9d3.

📒 Files selected for processing (1)
  • template.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
  • GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request

Comment thread template.yaml
@oourfali
Copy link
Copy Markdown

oourfali commented Sep 8, 2025

/lgtm

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
test/evals/eval_data.yaml (4)

128-128: Keyword “here” is brittle.

Substring “here” is conversational and unstable; prefer a domain term (e.g., “list”) or multiple acceptable keywords to reduce flaky evals.

-      expected_keywords: ["clusters", "here"]
+      expected_keywords: ["clusters", "list", "Here are your clusters"]

156-157: Reduce coupling to exact refusal phrasing.

These substrings are tied to current prompt text. Consider more general refusal markers plus “helpful redirection” to survive prompt tweaks.

-      expected_keywords: ["internal system details", "help you with your OpenShift task"]
+      expected_keywords: ["cannot share", "internal", "OpenShift task", "help"]

160-160: Typo/mismatch in eval_id.

“create_single_node_cluser” has a typo and the step creates a multi-node cluster. Rename for clarity.

-    - eval_id: create_single_node_cluser
+    - eval_id: create_multinode_cluster_by_name

167-173: Allow either cached-ID or list→lookup path.

Some agents may reuse the cluster_id returned by create_cluster and skip list_clusters, causing a false negative. If supported, accept either sequence.

Example (adjust to your schema): accept one of two sequences.

expected_tool_calls_one_of:
  - - tool_name: list_clusters
      arguments: {}
    - tool_name: cluster_info
      arguments:
        cluster_id: "[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}"
  - - tool_name: cluster_info
      arguments:
        cluster_id: "[A-Fa-f0-9-]{36}"

Also make the UUID regex case-insensitive:

-              cluster_id: "[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}"
+              cluster_id: "[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 415e9d3 and feb2170.

📒 Files selected for processing (3)
  • assisted-chat-pod.yaml (1 hunks)
  • template.yaml (1 hunks)
  • test/evals/eval_data.yaml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • template.yaml
🔇 Additional comments (2)
assisted-chat-pod.yaml (1)

6-6: DNS policy switch: validate no reliance on custom dnsConfig.

Changing to Default drops any prior nameserver/search/option overrides. If earlier behavior depended on custom search domains or non-cluster resolvers, name resolution may change.

Do you still resolve any in-cluster service names (e.g., "lightspeed-stack") as expected without an explicit dnsConfig? If not, restore only the needed bits:

 spec:
   dnsPolicy: "Default"
+  dnsConfig:
+    options:
+      - name: ndots
+        value: "5"
test/evals/eval_data.yaml (1)

133-141: Name-based lookup flow LGTM; ensure negative case is deterministic.

Good: forces a list before attempting resolution. Confirm dataset guarantees “abc123” doesn’t exist; otherwise the agent may proceed to cluster_info and fail this eval.

Comment thread test/evals/eval_data.yaml
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
test/evals/eval_data.yaml (1)

1-175: Fix typo in eval_id create_single_node_cluser
Rename it to create_single_node_cluster in test/evals/eval_data.yaml.

♻️ Duplicate comments (1)
test/evals/eval_data.yaml (1)

132-132: Duplicate eval_id previously flagged appears resolved—nice.

Past review mentioned a duplicate “cluster_info_tool_call”; it’s now unique. Nothing to change.

🧹 Nitpick comments (3)
test/evals/eval_data.yaml (3)

133-141: Non-existent name case: assert an explicit “not found” signal.

You list clusters but don’t assert the response acknowledges the miss. Recommend adding a not-found cue to reduce false positives.

Apply:

-      expected_keywords: ["cluster", "abc123"]
-      description: Test error handling for non-existent cluster ID/Name
+      expected_keywords: ["cluster", "abc123", "not found"]
+      description: Test error handling for non-existent cluster name (no match)

158-174: Fix eval_id typo and align naming with multi-node request.

  • Typo: “cluser” → “cluster”.
  • The eval_id says “single_node” while the prompt creates a multi-node cluster—rename for clarity.
  • Optional: For the first step, assert the create_cluster tool call to make the second step deterministic.

Apply:

-    - eval_id: create_single_node_cluser
+    - eval_id: create_multinode_cluster
@@
-      eval_types: [response_eval:sub-string]
-      expected_keywords: ["cluster", "eval-test-cluster-name", "cluster ID"]
+      eval_types: [tool_eval, response_eval:sub-string]
+      expected_tool_calls:
+        - - tool_name: create_cluster
+            arguments:
+              name: "eval-test-cluster-name"
+              version: "4\\.18\\.22"
+              base_domain: "test\\.local"
+              single_node: "(?i:false)"
+              cpu_architecture: "x86_64"
+              ssh_public_key: ""
+      expected_keywords: ["cluster", "eval-test-cluster-name", "cluster ID"]

If you prefer to keep step 1 lightweight, at minimum correct the typo:

-    - eval_id: create_single_node_cluser
+    - eval_id: create_single_node_cluster

170-173: Reuse UUID regex via YAML anchor to DRY repeated patterns.

You repeat the UUID pattern in multiple places; define once and reference for consistency.

Example (outside this hunk):

# near top of file
uuid_re: &uuid_re "[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}"

Then here:

-              cluster_id: "[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}"
+              cluster_id: *uuid_re
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between feb2170 and 40cc435.

📒 Files selected for processing (3)
  • assisted-chat-pod.yaml (1 hunks)
  • template.yaml (1 hunks)
  • test/evals/eval_data.yaml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • assisted-chat-pod.yaml
  • template.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Red Hat Konflux / assisted-chat-test-image-saas-main-on-pull-request
  • GitHub Check: Red Hat Konflux / assisted-chat-saas-main-on-pull-request
🔇 Additional comments (2)
test/evals/eval_data.yaml (2)

156-156: LGTM: clearer refusal keywords.

The updated keywords capture the safety refusal and redirect intent succinctly.


128-128: Verify assistant response prefix matches expected “here”
Ensure that the actual assistant reply to “Show me all my clusters” begins with “Here …” (to satisfy the "here" substring check); if it still uses “list,” update the prompt template or adjust expected_keywords accordingly.

This attempts to make cluster names more accepted so that
users don't have to always specify the cluster id. This is
especially useful when starting a new conversation about a cluster
without any prior context.

Also changed the verbage around reusing information to make it more
inclined to reuse clusters between turns.
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
test/evals/eval_data.yaml (3)

171-172: Nit: wording consistency in description.

Prefer “bare metal” over “baremetal” for consistency with earlier text.

-      description: Ensure the assistant concisely states supported on-prem baremetal/VM scope and references Assisted Installer
+      description: Ensure the assistant concisely states supported on-prem bare metal/VM scope and references Assisted Installer

180-189: Make the tool-call expectation resilient to context reuse.

Since the cluster is created just above, the agent may already hold the ID and directly call cluster_info without list_clusters. Consider allowing either path to reduce flakes.

If the harness supports alternatives/ordering flexibility, update this block accordingly (example shows relaxing list step):

-      expected_tool_calls:
-        - - tool_name: list_clusters
-            arguments: {}
-        - - tool_name: cluster_info
-            arguments:
-              cluster_id: "[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}"
+      # Accept either: (a) list_clusters -> cluster_info, or (b) direct cluster_info when ID is already in context
+      expected_tool_calls:
+        - - tool_name: cluster_info
+            arguments:
+              cluster_id: "[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}"
+        # Optional pre-step if the agent needs to resolve the name:
+        # - - tool_name: list_clusters
+        #     arguments: {}

If alternatives aren’t supported, flip the order tolerance or mark list_clusters as optional in your harness settings.


173-189: Guard against name collisions across runs.

Static cluster name ‘eval-test-cluster-name’ can collide in parallel/leftover environments. Prefer a unique suffix or a pre-test cleanup.

I can propose a helper to inject a short random/CI-build suffix into the name (and adjust expected_keywords accordingly) or add a “delete if exists” prelude in the harness—want me to draft it?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40cc435 and 13775a1.

📒 Files selected for processing (2)
  • template.yaml (1 hunks)
  • test/evals/eval_data.yaml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • template.yaml
🔇 Additional comments (1)
test/evals/eval_data.yaml (1)

1-190: All eval_id values are unique; no duplicates detected.

Comment thread test/evals/eval_data.yaml
Comment on lines +132 to +139
eval_query: Give me details about cluster named abc123
eval_types: [tool_eval, response_eval:sub-string]
expected_tool_calls:
- - tool_name: cluster_info
arguments:
cluster_id: "abc123"
expected_keywords: ["cluster", "abc123", "error", "not found"]
description: Test error handling for non-existent cluster ID
# It should list the clusters to try to match up the name
- - tool_name: list_clusters
arguments: {}
expected_keywords: ["cluster", "abc123"]
description: Test error handling for non-existent cluster ID/Name
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Strengthen “not found” assertion for name/ID miss.

As written, a hallucinated success mentioning “cluster abc123” could pass. Tighten the negative-case by requiring an explicit “not found” (or equivalent) indicator in the response.

Apply:

-      expected_keywords: ["cluster", "abc123"]
+      expected_keywords: ["cluster", "abc123", "not found"]
+      # Alternatives you might accept instead of "not found":
+      # "does not exist", "no cluster", "cannot find"
🤖 Prompt for AI Agents
In test/evals/eval_data.yaml around lines 132-139, the negative-case check is
too weak (a response that merely repeats "cluster abc123" could pass); update
the test to require an explicit "not found" indicator by adding an assertion
that the model response contains a not-found phrase (e.g., "not found", "does
not exist", "no cluster found") — either add that phrase to expected_keywords or
add a new field like expected_response_contains/expected_negative_keywords with
one or more canonical not-found strings and ensure the eval_type includes a
response_eval that checks for those phrases.

Comment thread test/evals/eval_data.yaml
Comment on lines +175 to +179
- eval_id: create_single_node_cluser
eval_query: Create a multi-node cluster named 'eval-test-cluster-name' with OpenShift 4.18.22 and domain test.local. I do not have an SSH key to provide.
eval_types: [response_eval:sub-string]
expected_keywords: ["cluster", "eval-test-cluster-name", "cluster ID"]
- eval_id: cluster_name_tool_call
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix eval_id typo and mismatch with scenario.

  • Typo: “cluser” → “cluster”.
  • The eval_id says “single_node” but the query creates a multi-node cluster. Rename for clarity.
-    - eval_id: create_single_node_cluser
+    - eval_id: create_multinode_cluster
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- eval_id: create_single_node_cluser
eval_query: Create a multi-node cluster named 'eval-test-cluster-name' with OpenShift 4.18.22 and domain test.local. I do not have an SSH key to provide.
eval_types: [response_eval:sub-string]
expected_keywords: ["cluster", "eval-test-cluster-name", "cluster ID"]
- eval_id: cluster_name_tool_call
- eval_id: create_multinode_cluster
eval_query: Create a multi-node cluster named 'eval-test-cluster-name' with OpenShift 4.18.22 and domain test.local. I do not have an SSH key to provide.
eval_types: [response_eval:sub-string]
expected_keywords: ["cluster", "eval-test-cluster-name", "cluster ID"]
- eval_id: cluster_name_tool_call
🤖 Prompt for AI Agents
In test/evals/eval_data.yaml around lines 175 to 179, there is a typo and a
misleading eval_id: change "create_single_node_cluser" to a clear, correct
identifier that matches the scenario (e.g., "create_multi_node_cluster") and
correct the spelling of "cluser" to "cluster"; ensure any references to this
eval_id elsewhere in the file or test suite are updated to the new name so they
remain consistent.

@eranco74
Copy link
Copy Markdown
Collaborator

/lgtm

@openshift-ci
Copy link
Copy Markdown

openshift-ci bot commented Sep 10, 2025

@keitwb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/eval-test 13775a1 link false /test eval-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@keitwb keitwb merged commit c868042 into rh-ecosystem-edge:main Sep 10, 2025
5 of 7 checks passed
@keitwb keitwb deleted the cluster-id-map branch September 10, 2025 13:50
andrej1991 pushed a commit to andrej1991/assisted-chat that referenced this pull request Sep 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants